Biclustering Using Message Passing
Abstract
Biclustering is the analog of clustering on a bipartite graph. Existing methods infer biclusters through local search strategies that find one cluster at a time; a common technique is to update the row memberships based on the current column memberships, and vice versa. We propose a biclustering algorithm that maximizes a global objective function using message passing. Our objective function closely approximates a general likelihood function, separating a cluster size penalty term into row- and column-count penalties. Because we use a global optimization framework, our approach excels at resolving the overlaps between biclusters, which are important features of biclusters in practice. Moreover, Expectation-Maximization can be used to learn the model parameters if they are unknown. In simulations, we find that our method outperforms two of the best existing biclustering algorithms, ISA and LAS, when the planted clusters overlap. Applied to three gene expression datasets, our method finds coregulated gene clusters that have high quality in terms of cluster size and density.
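The row/column alternation that the abstract attributes to existing local search methods can be illustrated with a minimal sketch. This is not the message-passing algorithm proposed in the paper, and it is not ISA or LAS either; the function name `alternating_bicluster`, the mean-threshold membership rule, and the all-columns initialization are assumptions chosen only for illustration.

```python
import numpy as np


def alternating_bicluster(X, threshold, n_iter=20):
    """Find a single bicluster by alternating row and column updates.

    Hypothetical helper: a row (column) is kept when its mean over the
    currently selected columns (rows) exceeds `threshold`. This rule is
    an illustrative assumption, not the paper's objective.
    """
    n_rows, n_cols = X.shape
    rows = np.zeros(n_rows, dtype=bool)
    cols = np.ones(n_cols, dtype=bool)  # start from all columns
    for _ in range(n_iter):
        # Update row memberships from the current column memberships.
        new_rows = X[:, cols].mean(axis=1) > threshold
        if not new_rows.any():
            break  # nothing passes the threshold; keep the last state
        # Update column memberships from the new row memberships.
        new_cols = X[new_rows, :].mean(axis=0) > threshold
        if not new_cols.any():
            break
        converged = (np.array_equal(new_rows, rows)
                     and np.array_equal(new_cols, cols))
        rows, cols = new_rows, new_cols
        if converged:
            break
    return rows, cols


# Toy example: a noise-free 3 x 4 block of ones planted in a 6 x 8 zero matrix.
X = np.zeros((6, 8))
X[:3, :4] = 1.0
rows, cols = alternating_bicluster(X, threshold=0.25)
print(np.flatnonzero(rows), np.flatnonzero(cols))
```

Because each pass looks only at the current selection, a procedure of this kind recovers one bicluster at a time and has no direct way to reason about overlapping biclusters, which is the limitation the global objective in the abstract is meant to address.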
Similar resources
Exemplar-based Robust Coherent Biclustering
The biclustering, co-clustering, or subspace clustering problem involves simultaneously grouping the rows and columns of a data matrix to uncover biclusters or sub-matrices of the data matrix that optimize a desired objective function. In coherent biclustering, the objective function contains a coherence measure of the biclusters. We introduce a novel formulation of the coherent biclustering pr...
Submatrix localization via message passing
The principal submatrix localization problem deals with recovering a K × K principal submatrix of elevated mean μ in a large n × n symmetric matrix subject to additive standard Gaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime Ω(√n) ≤ K ≤ ...
Scalable Co-clustering Algorithms
Co-clustering has been extensively used in varied applications because of its potential to discover latent local patterns that are otherwise unapparent by usual unsupervised algorithms such as k-means. Recently, a unified view of co-clustering algorithms, called Bregman co-clustering (BCC), provides a general framework that even contains several existing co-clustering algorithms, thus we expect...
Towards Abstraction of Message Passing Programming
Data-parallel applications are usually programmed in the SPMD paradigm by using a message passing system such as MPI or PVM. However, programming with message passing primitives is still tedious and error-prone. This paper presents an abstraction of message passing programming in C++ to relieve programmers of low-level considerations. The runtime overhead introduced by the abstraction is sho...
Reuse, Portability and Parallel
Parallel programs are typically written in an explicitly parallel fashion using either message passing or shared memory primitives. Message passing is attractive for performance and portability, since shared memory machines can efficiently execute message passing programs; however, message passing machines cannot in general effectively execute shared memory programs. In order to write a parallel pro...